Illumina NovaSeq 6000 paired end sequencing - SRA

ERX3267468: Illumina NovaSeq 6000 paired end sequencing
13 ILLUMINA (Illumina NovaSeq 6000) runs: 721.4M spots, 216.4G bases, 40.9Gb downloads

Submitted by: NYGC

Study: 30X whole genome sequencing coverage of the 2504 Phase 3 1000 Genome samples.

PRJEB31736 • ERP114329

Abstract:

We sequenced all 2,504 samples from the 1000 Genomes (1KG) Project to a minimum of 30x mean genome coverage. Though a small number of 1KG samples had been sequenced to high coverage previously, we sequenced all samples to high depth on the latest technology, providing a unified dataset for the next phase of analyses. We processed these samples using the laboratory processes we have previously used for the CCDG project (with minor modifications). Specifically, we generated PCR-free sequencing libraries using unique dual indices to avoid the index switching that occurs on Illumina patterned flow cells and causes low-level contamination of sequencing data. We sequenced these samples on the Illumina NovaSeq 6000 sequencing instrument, with 2x150bp reads. We believe this instrument represents the future for WGS with short-read technology, and it was important to sequence the 1KG samples in a format that is consistent with future large-scale sequencing projects. Our automated analysis pipeline for whole genome sequencing matches the CCDG and TOPMed recommended best practices. Sequencing reads were aligned to the human reference, hs38DH, using BWA-MEM v0.7.15. Data are further processed using the GATK best practices (v3.5), which generate VCF files in the 4.2 format. Single nucleotide variants and indels are called using GATK HaplotypeCaller (v3.5), which generates a single-sample GVCF. Variant Quality Score Recalibration (VQSR) is performed using dbSNP138 so that quality metrics for each variant can be used in downstream variant filtering.

Sample: Coriell GM19764

SAMN00004475 • SRS006567

Organism: Homo sapiens

Library:

Name: NA19764

Instrument: Illumina NovaSeq 6000

Strategy: WGS

Source: GENOMIC

Selection: RANDOM

Layout: PAIRED

Construction protocol: TruSeq DNA PCR-free

Runs: 13 runs, 721.4M spots, 216.4G bases, 40.9Gb

Run	# of Spots	# of Bases	Size	Published
ERR3240093	360,685,727	108.2G	9Gb	2019-03-25
ERR3560568	29,257,359	8.8G	2.6Gb	2019-10-02
ERR3560570	30,077,294	9G	2.7Gb	2019-10-02
ERR3560572	28,709,627	8.6G	2.5Gb	2019-10-02
ERR3560574	29,824,100	8.9G	2.6Gb	2019-10-02
ERR3560576	32,287,539	9.7G	2.9Gb	2019-10-02
ERR3560578	32,523,247	9.8G	2.9Gb	2019-10-02
ERR3560580	31,165,732	9.3G	2.8Gb	2019-10-02
ERR3560582	32,581,666	9.8G	2.9Gb	2019-10-02
ERR3560584	28,397,011	8.5G	2.5Gb	2019-10-02
ERR3560586	28,913,872	8.7G	2.5Gb	2019-10-02
ERR3560588	28,286,452	8.5G	2.5Gb	2019-10-02
ERR3560590	28,661,828	8.6G	2.5Gb	2019-10-02
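
The summary figures in the "Runs:" line can be cross-checked against the table. The following is a minimal sketch, not part of the SRA record: it copies the spot counts from the table above and assumes that every spot is a pair of full-length 150 bp reads (taken from "2x150bp reads" in the abstract). The names RUNS, total_spots and total_bases are made up for this illustration.

```python
# Spot counts copied from the run table in this record.
RUNS = [
    ("ERR3240093", 360_685_727),
    ("ERR3560568", 29_257_359),
    ("ERR3560570", 30_077_294),
    ("ERR3560572", 28_709_627),
    ("ERR3560574", 29_824_100),
    ("ERR3560576", 32_287_539),
    ("ERR3560578", 32_523_247),
    ("ERR3560580", 31_165_732),
    ("ERR3560582", 32_581_666),
    ("ERR3560584", 28_397_011),
    ("ERR3560586", 28_913_872),
    ("ERR3560588", 28_286_452),
    ("ERR3560590", 28_661_828),
]

# Assumption: paired layout with untrimmed 150 bp reads (2x150bp in the abstract).
READ_LENGTH = 150
READS_PER_SPOT = 2


def total_spots(runs):
    """Sum the spot counts over all runs."""
    return sum(spots for _, spots in runs)


def total_bases(runs):
    """Estimate bases as spots * reads per spot * read length."""
    return total_spots(runs) * READS_PER_SPOT * READ_LENGTH


if __name__ == "__main__":
    print(f"runs:  {len(RUNS)}")
    print(f"spots: {total_spots(RUNS) / 1e6:.1f}M")
    print(f"bases: {total_bases(RUNS) / 1e9:.1f}G")
```

Under these assumptions the totals round to 721.4M spots and 216.4G bases, matching the summary line. The twelve runs published on 2019-10-02 also add up to exactly the spot count of ERR3240093; the record does not say why.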

ID: 7513888
